Skip to content

Add support for limiting maximum execution time based on wall-time #6504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

kocsismate
Copy link
Member

@kocsismate kocsismate commented Dec 11, 2020

On most platforms, PHP currently measures timeouts based on the CPU time, rather than wall-time. This can be fairly surprising, since neither sleep(), nor network or system calls count towards the limit of the max_execution_time config.

Unfortunately, it is not only surprising, but it can have serious consequences for distributed systems with high traffic, where finishing a request in a timely manner is essential for avoiding cascading failures.

To make things even worse, there is no easy solution for the problem at all: e.g. when using PHP-FPM as process manager, one could define the request_terminate_timeout pool-level config option to stop execution after a certain amount of time at latest. This can help, but still falls short when there is a wide variety of acceptable timeout settings for the individual scripts. So in the end, one would have to maintain pools for slow and fast scripts... Clearly, this can be easily become a big burden.

Of course, each individual network/system calls should have their own timeouts, however execution time can still easily go completely out of control when there are hundreds or even thousands of them during the same request. This also applies for CLI scripts (e.g. cron jobs) which can possibly execute millions of DB queries, although the timeout command comes in handy in this case. Another solution can be using something like if (time() - $startTime > $timeout) { die("Timeout exceeded"); }... But I don't even want to consider this idea.

HHVM solved the problem by introducing the TimeoutsUseWallTime ini setting (facebook/hhvm@9a9b42e) in order to be able to change the meaning of max_execution_time, while still (partially) remaining compatible with PHP.

Contrarily to the above, I propose adding support for a max_execution_wall_time option, so that wall-time becomes measurable as well. A limitation of the implementation is that the timeout takes into effect on a best-effort basis, only after the network/system call which exceeded the time limit is finished. I do think this is an acceptable limitation, and still much better than having almost no control of the real execution time.

@kocsismate kocsismate added the RFC label Dec 11, 2020
@kocsismate kocsismate marked this pull request as draft December 11, 2020 00:09
@nikic
Copy link
Member

nikic commented Dec 11, 2020

Not a fan of having two different max_execution_time settings that can both be active. I'd prefer the HHVM solution of toggling the meaning. I'd also be open to switching to wall-time only, which seems more useful to me, but I don't really get the technical reasons for the current choice... Maybe starting a discussion on internals will help understand why it works the way it does right now.

@kocsismate
Copy link
Member Author

Thanks for the feedback! Yeah, I'm ok with either way, I chose this solution for keeping BC completely intact, and because changing the meaning of an ini setting by another ini setting felt very controversial. In terms of the implementation, the other 2 solutions besides the current one are easier and cheaper (there is no double book keeping), so if I could choose, I'd also aim for counting the full wall-time. Unfortunately, neither can I imagine why the current solution was chosen back then.

That said, I'll start a discussion soon, I just wanted to get some early feedback here, before announcing anything. :)

@kocsismate kocsismate marked this pull request as ready for review December 26, 2020 23:26
@kocsismate
Copy link
Member Author

@nikic has the mailing list discussion changed your point of view by chance (If you followed it)? My main fear about going the Hack way is that changing only one type of timeout on the fly would be cumbersome as you would have to set both ini settings at the same time, while by having the current implementation, it's ok if you only modify the ini setting you need.

Also, I had difficulties with "aliasing" the two settings on Windows. Can you please have a technical review sometime soonish because I'd like to start the vote in the closer future.

@kocsismate
Copy link
Member Author

@nikic I added two tests to document the interaction of the the new ini setting with the hard timeout, which was pointed out by you. Unfortunately the test which is supposed to be killed by reaching hard_timeout hangs when using the built-in CLI server (at least locally).


#ifndef ZTS
if (EG(hard_timeout)) {
zend_set_wall_timeout_ex(EG(hard_timeout), 1);
Copy link
Member Author

@kocsismate kocsismate Feb 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding https://externals.io/message/112492#112783, my understanding is that the following happens:

  • when max_execution_wall_time is reached, the necessary globals are set (timed_out and vm_interrupt)
  • the timer is restarted with the hard_timeout when it has a value other than 0 (BTW: shouldn't we disallow negative values?)
  • if the timeout is reached again, the process gets aborted

What is not clear for me, why timeouts inside internal functions can cause a process abort immediately, given their individual timeout is >= hard_timeout? What am I missing?

Another thing I've just become unsure about is whether it is really a good idea to restart the real-time timer instead of the original CPU-time one:

Suggested change
zend_set_wall_timeout_ex(EG(hard_timeout), 1);
zend_set_timeout_ex(EG(hard_timeout), 1);

@kocsismate kocsismate force-pushed the max-walltime branch 2 times, most recently from ca22aad to a417436 Compare February 14, 2021 22:36
@mvorisek
Copy link
Contributor

mvorisek commented Jun 23, 2021

Some functions has no timeout option (for ex. file_get_contents called on remote FS path). It would be great if this PR will also provide a function to execute a callback with a wall-time timeout and when timeout limit will be reached, an exception will be thrown.

@rlerdorf
Copy link
Member

Maybe starting a discussion on internals will help understand why it works the way it does right now.

I wrote this initial code 25 years ago. At that time I couldn't use ITIMER_REAL because Apache used alarm() internally for timing out requests and I'd mess that up by catching the SIGALRM in PHP. SIGPROF had no such conflicts so that is why we have ITIMER_PROF today. I don't think modern Apache uses SIGALRM, so it should be possible now, but be careful of possible SAPIs that might live in an environment where you can't just steal SIGALRM.

@kocsismate kocsismate added this to the PHP 8.2 milestone Jul 25, 2021
@kocsismate
Copy link
Member Author

Thank you, @rlerdorf for this extensive background information, it was a fascinating read :) I'll try to gather more information about SAPIs with possible SIGALRM issues -- although I'm not sure it's a problem in practice since Windows (and a few more systems including IBM PASE) already only measures ITIMER_REAL.

@iluuu1994
Copy link
Member

@kocsismate I think you mentioned that you're still interested in pursing this, right? 🙂

@kocsismate
Copy link
Member Author

kocsismate commented Apr 17, 2022

Yeah, definitely! I want to first finish a few other tasks (stubs, read only classes) and I try to finish it afterwards.

P.s. i'd appreciate the constant stub review so that I can continue with the sensitive param change🙂 i'd like to avoid having to deal with conflicts

@dunglas
Copy link
Member

dunglas commented Jan 3, 2023

#10141 introduces support for timer_create() on Linux (it should also be possible to support FreeBSD in the future), which fixes the maximum runtime functionality in multithreaded environments.

Currently, the patch uses CPU-based time to be consistent with existing behavior, but perhaps - as suggested by @morrisonlevi - should we take this opportunity to switch to wall time?

In this case, compiling with --enable-zend-timers will allow switching to wall time on Linux, even on non-ZTS builds, and ZTS builds will directly use wall time (currently, you cannot use max_execution_time with ZTS because this feature is broken).

#10141 also fixes the signal conflicts reported by @rlerdorf between PHP and other tools such as old-Apache, Go, and profilers.

@adoy adoy removed this from the PHP 8.2 milestone Feb 16, 2023
@dstogov
Copy link
Member

dstogov commented Oct 9, 2023

@kocsismate I'm not sure if I should review this. Please fix the merge conflicts first or close this or "convert to draft".

@Girgias
Copy link
Member

Girgias commented Oct 9, 2023

@kocsismate I'm not sure if I should review this. Please fix the merge conflicts first or close this or "convert to draft".

@dstogov GitHub seems to have decided to flag a lot of people on old PRs

@kocsismate
Copy link
Member Author

Most likely, I'm not going to continue this RFC for the time being, so anyone can feel free to take it over.

@kocsismate kocsismate closed this Oct 9, 2023
@dstogov
Copy link
Member

dstogov commented Oct 9, 2023

@dstogov GitHub seems to have decided to flag a lot of people on old PRs

yeah. I got ~50 old PRs for review :(

@dunglas
Copy link
Member

dunglas commented Oct 9, 2023

Now that #10141 has been merged and released, it's possible to opt-in for wall-time-based timers on Linux by setting the --enable-zend-max-execution-timers configure flag. It is enabled by default for ZTS builds in 8.3, but not (yet) for non-ZTS builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants